NSF PAR Search | NSF Public Access Repository

Multi-task Representation Learning for Fixed Budget Pure-Exploration in Linear and Bilinear Bandits

Mukherjee, Subhojyoti; Xie, Qiaomin; Nowak, Robert D (August 2025, Reinforcement Learning Journal)

Free, publicly-accessible full text available August 8, 2026
Pretraining Decision Transformers with Reward Prediction for In-Context Multi-task Structured Bandit Learning

Mukherjee, Subhojyoti; Hanna, Josiah P; Xie, Qiaomin; Nowak, Robert D (August 2025, Reinforcement Learning Journal)

Free, publicly-accessible full text available August 8, 2026
Weighted variation spaces and approximation by shallow ReLU networks

https://doi.org/10.1016/j.acha.2024.101713

DeVore, Ronald; Nowak, Robert D; Parhi, Rahul; Siegel, J W (January 2025, Applied and Computational Harmonic Analysis)

Full Text Available
ReLUs Are Sufficient for Learning Implicit Neural Representations

Shenouda, Joseph; Zhou, Yamin; Nowak, Robert D (May 2024, International Conference on Machine Learning)

Motivated by the growing theoretical understanding of neural networks that employ the Rectified Linear Unit (ReLU) as their activation function, we revisit the use of ReLU activation functions for learning implicit neural representations (INRs). Inspired by second order B-spline wavelets, we incorporate a set of simple constraints to the ReLU neurons in each layer of a deep neural network (DNN) to remedy the spectral bias. This in turn enables its use for various INR tasks. Empirically, we demonstrate that, contrary to popular belief, one can learn state-of-the-art INRs based on a DNN composed of only ReLU neurons. Next, by leveraging recent theoretical works which characterize the kinds of functions ReLU neural networks learn, we provide a way to quantify the regularity of the learned function. This offers a principled approach to selecting the hyperparameters in INR architectures. We substantiate our claims through experiments in signal representation, super resolution, and computed tomography, demonstrating the versatility and effectiveness of our method. The code for all experiments can be found at https://github.com/joeshenouda/relu-inrs.
Full Text Available
SaVeR: Optimal Data Collection Strategy for Safe Policy Evaluation in Tabular MDP

Mukherjee, Subhojyoti; Hanna, Josiah P; Nowak, Robert D (May 2024, International Conference on Machine Learning)

In this paper, we study safe data collection for the purpose of policy evaluation in tabular Markov decision processes (MDPs). In policy evaluation, we are given a target policy and asked to estimate the expected cumulative reward it will obtain. Policy evaluation requires data, and we are interested in the question of which behavior policy should collect the data for the most accurate evaluation of the target policy. While prior work has considered behavior policy selection, in this paper we additionally consider a safety constraint on the behavior policy. Namely, we assume there exists a known default policy that incurs a particular expected cost when run, and we enforce that the cumulative cost of all behavior policies run is better than a constant factor of the cost that would be incurred had we always run the default policy. We first show that there exists a class of intractable MDPs where no safe oracle algorithm with knowledge about problem parameters can efficiently collect data and satisfy the safety constraints. We then define the tractability condition for an MDP such that a safe oracle algorithm can efficiently collect data, and, using it, we prove the first lower bound for this setting. We then introduce an algorithm, SaVeR, for this problem that approximates the safe oracle algorithm, and we bound the finite-sample mean squared error (MSE) of the algorithm while ensuring it satisfies the safety constraint. Finally, we show in simulations that SaVeR produces low-MSE policy evaluation while satisfying the safety constraint.
Full Text Available
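
The safety constraint described in the SaVeR abstract above can be illustrated with a small sketch. This is not the SaVeR algorithm. It assumes that lower cost is better, that the default policy has a known per-episode cost, and that the constant factor is a number `alpha >= 1`. The function names and the conservative fallback rule are hypothetical and chosen only for illustration.

```python
def is_safe(behavior_costs, default_cost, alpha):
    """Check the cumulative-cost constraint (illustrative assumption).

    behavior_costs: costs incurred by the behavior policies run so far.
    default_cost:   assumed known per-episode cost of the default policy.
    alpha:          assumed constant factor (>= 1) allowed over the default.
    """
    budget = alpha * default_cost * len(behavior_costs)
    return sum(behavior_costs) <= budget


def choose_policy(past_costs, worst_case_explore_cost, default_cost, alpha):
    """Hypothetical conservative rule: explore only if the constraint would
    still hold after an exploratory episode with worst-case cost."""
    hypothetical = list(past_costs) + [worst_case_explore_cost]
    return "explore" if is_safe(hypothetical, default_cost, alpha) else "default"


if __name__ == "__main__":
    print(is_safe([1.0, 1.0, 1.0], default_cost=1.0, alpha=1.5))
    print(choose_policy([1.0, 1.0], 3.0, default_cost=1.0, alpha=1.5))
```

Falling back to the default policy whenever the worst case would break the budget is the simplest way to keep the constraint satisfied at every step; the paper's method may allocate exploration quite differently.
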
On Penalty Methods for Nonconvex Bilevel Optimization and First-Order Stochastic Approximation

Kwon, Jeongyeol; Kwon, Dohyun; Wright, Stephen; Nowak, Robert D (January 2024, International Conference on Learning Representations)

In this work, we study first-order algorithms for solving Bilevel Optimization (BO) where the objective functions are smooth but possibly nonconvex in both levels and the variables are restricted to closed convex sets. As a first step, we study the landscape of BO through the lens of penalty methods, in which the upper- and lower-level objectives are combined in a weighted sum governed by a penalty parameter. In particular, we establish a strong connection between the penalty function and the hyper-objective by explicitly characterizing the conditions under which the values and derivatives of the two must be close. A by-product of our analysis is an explicit formula for the gradient of the hyper-objective when the lower-level problem has multiple solutions under minimal conditions, which could be of independent interest. Next, viewing the penalty formulation as an approximation of the original BO, we propose first-order algorithms that find an approximately stationary solution by optimizing the penalty formulation with a suitably chosen penalty parameter. When the perturbed lower-level problem uniformly satisfies the small-error proximal error-bound (EB) condition, we propose a first-order algorithm that converges to an approximately stationary point of the penalty function using only first-order stochastic gradient oracles. Under an additional assumption on stochastic oracles, we show that the algorithm can be implemented in a fully single-loop manner and achieves an improved oracle complexity.
Full Text Available
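
The penalty idea in the bilevel optimization abstract above can be tried on a toy problem. The sketch below is not the paper's algorithm. The upper-level objective `f(x, y) = (x - 1)^2 + (y - 3)^2`, the lower-level objective `g(x, y) = (y - x)^2`, and the name `sigma` for the penalty parameter are all assumptions made for illustration. Because the lower-level minimum is `y = x` with value 0, the hyper-objective is `f(x, x)`, which is minimized at `x = 2`. The penalty function used here is `f + g / sigma`, minimized with plain deterministic gradient descent.

```python
def penalty_minimizer(sigma, steps=20000):
    """Minimize P(x, y) = f(x, y) + g(x, y) / sigma by gradient descent.

    Toy problem (assumed for illustration):
      f(x, y) = (x - 1)^2 + (y - 3)^2   upper-level objective
      g(x, y) = (y - x)^2               lower-level objective, minimum value 0
    """
    x, y = 0.0, 0.0
    # Step size 1 / L, where L = 2 + 4 / sigma is the largest Hessian eigenvalue.
    lr = 1.0 / (2.0 + 4.0 / sigma)
    for _ in range(steps):
        grad_x = 2.0 * (x - 1.0) - 2.0 * (y - x) / sigma
        grad_y = 2.0 * (y - 3.0) + 2.0 * (y - x) / sigma
        x, y = x - lr * grad_x, y - lr * grad_y
    return x, y


if __name__ == "__main__":
    for sigma in (1.0, 0.1, 0.01):
        x, y = penalty_minimizer(sigma)
        # The hyper-objective minimizer of this toy problem is x = 2.
        print(f"sigma={sigma}: x={x:.5f}, gap to 2 = {abs(x - 2.0):.5f}")
```

In this toy problem the penalty minimizer has the closed form `x = 1 + 2 / (sigma + 2)`, so its distance to the bilevel solution shrinks as `sigma` shrinks. The price is a worse-conditioned problem, which is why the step size has to scale with `sigma`.
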
Looped Transformers are Better at Learning Learning Algorithms

Yang, Liu; Lee, Kangwook; Nowak, Robert D; Papailiopoulos, Dimitris (January 2024, International Conference on Learning Representations)

Transformers have demonstrated effectiveness in solving data-fitting problems in-context across various (latent) models, as reported by Garg et al. (2022). However, the absence of an inherent iterative structure in the transformer architecture presents a challenge in emulating the iterative algorithms that are commonly employed in traditional machine learning methods. To address this, we propose a looped transformer architecture and an associated training methodology, with the aim of incorporating iterative characteristics into transformer architectures. Experimental results suggest that the looped transformer achieves performance comparable to the standard transformer in solving various data-fitting problems, while utilizing less than 10% of the parameter count.
Full Text Available
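
The "iterative structure" that the looped transformer abstract above refers to can be shown by analogy with a classical iterative solver. The sketch below is not a transformer and not the paper's method. It shows one shared update step applied repeatedly to a one-dimensional least-squares problem, so the number of parameters (here, a single step size) does not grow with the number of iterations. The function names and the data are hypothetical.

```python
def gd_step(w, xs, ys, lr):
    """One gradient step on the mean squared error of the model y = w * x."""
    grad = sum((w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
    return w - lr * grad


def looped_solver(xs, ys, lr=0.1, loops=200):
    """Apply the same step `loops` times; the single parameter `lr` is shared
    across all iterations, in the spirit of weight sharing across loops."""
    w = 0.0
    for _ in range(loops):
        w = gd_step(w, xs, ys, lr)
    return w


if __name__ == "__main__":
    xs = [1.0, 2.0, 3.0]
    ys = [2.0, 4.0, 6.0]  # generated by y = 2 * x
    print(looped_solver(xs, ys))
```

An unrolled variant would carry a separate step size per iteration, much as a standard deep stack carries separate weights per layer. Sharing the block is what keeps the parameter count independent of depth.
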
Deep Learning Meets Sparse Regularization: A signal processing perspective

https://doi.org/10.1109/MSP.2023.3286988

Parhi, Rahul; Nowak, Robert D (September 2023, IEEE Signal Processing Magazine)

Full Text Available
A Continuous Transform for Localized Ridgelets

https://doi.org/10.1109/SampTA59647.2023.10301398

Shenouda, Joseph; Parhi, Rahul; Nowak, Robert D (July 2023, IEEE)

Full Text Available
Feed Two Birds with One Scone: Exploiting Wild Data for Both Out-of-Distribution Generalization and Detection

Bai, Haoyue; Canal, Gregory; Du, Xuefeng; Kwon, Jeongyeol; Nowak, Robert D; Li, Yixuan (August 2023, International Conference on Machine Learning)

Modern machine learning models deployed in the wild can encounter both covariate and semantic shifts, giving rise to the problems of out-of-distribution (OOD) generalization and OOD detection respectively. While both problems have received significant research attention lately, they have been pursued independently. This may not be surprising, since the two tasks have seemingly conflicting goals. This paper provides a new unified approach that is capable of simultaneously generalizing to covariate shifts while robustly detecting semantic shifts. We propose a margin-based learning framework that exploits freely available unlabeled data in the wild that captures the environmental test-time OOD distributions under both covariate and semantic shifts. We show both empirically and theoretically that the proposed margin constraint is the key to achieving both OOD generalization and detection. Extensive experiments show the superiority of our framework, outperforming competitive baselines that specialize in either OOD generalization or OOD detection. Code is publicly available at https://github.com/deeplearning-wisc/scone.
Full Text Available
